Abstract




The VLSI implementation of a fuzzy logic inference
mechanism allows the use of rule-based control and
decision making in demanding real-time applications
such as robot control and in the area of command
and control. The full custom CMOS VLSI is described.
The chip is the second generation of the design.
It has several design features which make the
use of this chip realistic. These features include a
reconfigurable architecture, on-chip fuzzification and
defuzzification, and memory and data-path redundancy.
The chip consists of 614,000 transistors, of
which 460,000 are used for RAM memory.



1  Introduction 


Fuzzy logic based control uses a rule-based expert
system paradigm in the area of real-time process
control [4]. It has been used successfully in numerous
areas including chemical process control, train
control [12], cement kiln control [2], and control of
small aircraft [5]. In order to use this paradigm of
a fuzzy rule-based controller in demanding real-time
applications, the VLSI implementation of the inference
mechanism has been an active research topic
[9,10,11]. Potential applications of such a VLSI
inference processor include real-time decision-making
in the area of command and control [3], control of
precision machinery [1], and robotic systems [6].

We have been designing a second-generation VLSI
fuzzy logic inference engine on a chip. The new
architecture of the inference processor has the following
important improvements compared to previous work:

1.  programmable  rule  set  memory 

2. on-chip fuzzifying operation - table lookup

3. on-chip defuzzifying operation - center of area
algorithm

4.  reconfigurable  architecture 

5.  RAM  redundancy  for  higher  yield 

The original prototype experimental chip (designed
at AT&T Bell Labs) had minimal logic on
chip. For example, it used ROM for the rule set memory,
which reduced its utility [10]. We are now designing
a more realistic chip which has RAM for the rule
set memory so that the rules are programmable. In
addition to the fuzzy inference mechanism, the
fuzzifying and defuzzifying operations are performed on
chip. The new design has a reconfigurable architecture
such that we can have either 51 rules, 4 inputs
and 2 outputs, or 102 rules, 2 inputs and 1 output.
These new design decisions render the new architecture
realistic.


2  Fuzzy  Set  and  Fuzzy  Logic 

A fuzzy set is based on a generalization of the concept
of the ordinary set. In an ordinary set, we associate
a characteristic function with each set. For example,
we can define a set $S$ with its characteristic function
$f_S : U \to \{0, 1\}$. Then, for all $e$ in the universal set $U$,

$$e \in S \quad \text{if } f_S(e) = 1,$$
$$e \notin S \quad \text{if } f_S(e) = 0.$$
Figure 1: Approximately 100 km/h.


Each element of the universe either belongs to or does
not belong to the set $S$. In a fuzzy set, an element
can be a member of the set with a varying degree of
membership. The associated characteristic function
therefore returns any real number between 0 and 1,
and it is termed the membership function. For
a fuzzy set $F$, we have an associated membership
function $\mu_F : U \to [0, 1]$. For example, if element $e$
is a member of fuzzy set $F$ with degree 0.34, the
associated membership function returns this value,
$\mu_F(e) = 0.34$. If $\mu_F(e) = 0$, $e$ is entirely outside of
fuzzy set $F$, and if $\mu_F(e) = 1$, $e$ is entirely inside
of fuzzy set $F$. A fuzzy set is represented by a set
of ordered pairs of an element $u_i$ and its grade of
membership:

$$F = \{ (u_i, \mu_F(u_i)) \mid u_i \in U \},$$

where $U$ is a universe of discourse. Using a fuzzy set,
we can represent imprecise and vague concepts and
data. For example, approximately 100 km/h is represented
by the fuzzy set whose membership function is
shown in Figure 1. We can extend classical set theory
by defining basic set theoretic operations over
fuzzy sets. The following definitions of intersection
and union of fuzzy sets were suggested by Zadeh
[13]. The set theoretic operations on fuzzy sets are
defined via their membership functions. Let $A$ and
$B$ be fuzzy sets; then the union, intersection and
complement of the fuzzy sets are defined as follows. The
membership function of the intersection $C = A \cap B$
is defined by

$$\mu_C(e) = \min(\mu_A(e), \mu_B(e)), \quad e \in U.$$

The membership function of the union $D = A \cup B$ is
defined by

$$\mu_D(e) = \max(\mu_A(e), \mu_B(e)), \quad e \in U.$$

The membership function of the complement $\neg A$ of
$A$ is defined by

$$\mu_{\neg A}(e) = 1 - \mu_A(e), \quad e \in U.$$
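The three definitions above can be illustrated with a minimal Python sketch. The dictionary representation, the function names, and the sample grades are assumptions made for illustration only; they do not come from the paper.

```python
# Fuzzy sets over a small finite universe, stored as {element: grade}.
# The universe (speeds in km/h) and the grades are made-up values.

def fuzzy_intersection(a, b):
    """Membership of A intersect B: pointwise minimum."""
    return {e: min(a[e], b[e]) for e in a}

def fuzzy_union(a, b):
    """Membership of A union B: pointwise maximum."""
    return {e: max(a[e], b[e]) for e in a}

def fuzzy_complement(a):
    """Membership of not-A: one minus the grade."""
    return {e: 1 - a[e] for e in a}

A = {90: 0.25, 100: 1.0, 110: 0.5}
B = {90: 0.75, 100: 0.5, 110: 0.0}

C = fuzzy_intersection(A, B)   # {90: 0.25, 100: 0.5, 110: 0.0}
D = fuzzy_union(A, B)          # {90: 0.75, 100: 1.0, 110: 0.5}
not_A = fuzzy_complement(A)    # {90: 0.75, 100: 0.0, 110: 0.5}
```

When every grade is restricted to 0 or 1, these operations reduce to the ordinary set intersection, union and complement.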

In  the  traditional  logic,  one  of  the  most  important 
inference  rules  is  modus  ponens,  that  is 


Premise:      A is true
Implication:  If A then B
Conclusion:   B is true

Here, A and B are crisply defined propositions. We
can construct a fuzzy proposition using a fuzzy set,
such as:

Current speed is approximately 100 km/h.

By introducing fuzzy propositions into modus ponens,
we can generalize modus ponens. Let
$C$, $C'$, $D$, $D'$ be fuzzy sets. Then the generalized
modus ponens states:

Premise:      x is C'
Implication:  If x is C then y is D
Conclusion:   y is D'

We can use different premises to arrive at different
conclusions using the same implication. For example,

Premise:      Visibility is slightly low
Implication:  If visibility is low then condition is poor
Conclusion:   Condition is slightly poor

or

Premise:      Visibility is very low
Implication:  If visibility is low then condition is poor
Conclusion:   Condition is very poor

The above inference is based on the compositional
rule of inference for approximate reasoning proposed
by Zadeh [14]. Suppose we have two rules with two
fuzzy clauses in the IF-part and one clause in the
THEN-part:

Rule 1: If ($x$ is $A_1$) and ($y$ is $B_1$) then ($z$ is $C_1$),
Rule 2: If ($x$ is $A_2$) and ($y$ is $B_2$) then ($z$ is $C_2$).

We can combine the inference of the multiple rules
by assuming the rules are connected by an OR connective,
that is, Rule 1 OR Rule 2 [10].

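Given the fuzzy propositions ($x$ is $A'$) and ($y$ is $B'$),
the weights $\alpha_i^A$ and $\alpha_i^B$ of the clauses of the
premises are calculated by

$$\alpha_i^A = \max(A', A_i),$$
$$\alpha_i^B = \max(B', B_i), \quad \text{for } i = 1, 2.$$

Then the weights $w_1$ and $w_2$ of the premises are
calculated by

$$w_1 = \min(\alpha_1^A, \alpha_1^B),$$
$$w_2 = \min(\alpha_2^A, \alpha_2^B).$$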


Weight $\alpha_i^A$ represents the closeness of proposition ($x$
is $A_i$) and proposition ($x$ is $A'$). Weight $w_i$ represents
a similar measure for the entire premise of the $i$-th
rule. The conclusion of the first rule is

$$C_1' = \min(w_1, C_1).$$

The conclusion of the second rule is

$$C_2' = \min(w_2, C_2).$$

The overall conclusion $C'$ is obtained by

$$C' = \max(C_1', C_2').$$

This inference process is shown in Figure 2. In this
example, $\alpha_1^A = 0.5$ and $\alpha_1^B = 0.25$, therefore $w_1 = 0.25$.
$\alpha_2^A = 0.85$ and $\alpha_2^B = 0.5$, therefore $w_2 = 0.5$.
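The two-rule inference can be sketched in Python as follows. This is an illustration under stated assumptions, not the chip's implementation: fuzzy sets are lists of grades over a made-up four-element universe, and the clause weight is read as the maximum over the universe of the pointwise minimum of the two sets, an interpretation consistent with the min/max units described for the data-path. The sample sets are chosen only so that the weights reproduce the values quoted above (0.5, 0.25, 0.85, 0.5).

```python
def clause_weight(observed, reference):
    """Closeness of two fuzzy sets: max over the universe of the
    pointwise min (assumed reading of the clause-weight formula)."""
    return max(min(o, r) for o, r in zip(observed, reference))

def premise_weight(rule, a_obs, b_obs):
    """Weight w_i of a rule (A_i, B_i, C_i): min of its clause weights."""
    a_i, b_i, _ = rule
    return min(clause_weight(a_obs, a_i), clause_weight(b_obs, b_i))

def infer(rules, a_obs, b_obs):
    """Clip each rule's conclusion by its weight, then combine by max."""
    result = [0.0] * len(rules[0][2])
    for rule in rules:
        w = premise_weight(rule, a_obs, b_obs)
        clipped = [min(w, c) for c in rule[2]]
        result = [max(r, c) for r, c in zip(result, clipped)]
    return result

# Hypothetical data: crisp-like observations and two rules.
a_obs = [0.0, 1.0, 0.0, 0.0]
b_obs = [0.0, 0.0, 1.0, 0.0]
rules = [
    ([0.0, 0.5, 1.0, 0.5], [1.0, 0.5, 0.25, 0.0], [1.0, 0.5, 0.0, 0.0]),
    ([0.5, 0.85, 1.0, 0.0], [0.0, 0.25, 0.5, 1.0], [0.0, 0.25, 1.0, 0.75]),
]

w1 = premise_weight(rules[0], a_obs, b_obs)   # 0.25
w2 = premise_weight(rules[1], a_obs, b_obs)   # 0.5
conclusion = infer(rules, a_obs, b_obs)       # [0.25, 0.25, 0.5, 0.5]
```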

3  Rule-based  controller 

The usual approach to automatic process control is
to establish a mathematical model of the process.
However, this is not always feasible. In some cases,
there is no proper mathematical model because the
process is too complex or ill-understood. In other
cases, experimenting with plants for the construction of
mathematical models is too expensive. In still other
cases, the mathematical models are too complicated
or computationally expensive and are not suitable for
real-time use. For such processes, however, skilled
human controllers may be able to operate the plant
satisfactorily. The operators are quite often able to
express their operating practice in the form of rules
which may be used in a rule-based controller.
Rule-based controllers model the behavior of the expert
human operator instead of the process. The following
is a rule from an aircraft flight controller [5].


This  rule  takes  three  inputs  and  has  two  outputs. 

If   (1) The rate of descent is Positively Medium,
     (2) The airspeed is Negatively Big (compared
         to the desired airspeed),
     (3) The glide slope is Positively Big (compared
         to the desired slope).

Then (1) change engine speed by Positively Big,
         and
     (2) change elevator angle by Insignificant
         Change.

The expressions Positively Medium, Positively Big,
Insignificant Change, and others represent imprecise
amounts. They represent the intuitive feel of the expert
human controller. They correspond to the imprecise
expressions used by the expert for communicating a
rule of thumb. They are represented by using fuzzy
sets and their associated membership functions.

A fuzzy set such as Positively Medium is represented
by a membership function over an appropriate
universe of discourse such as revolutions per
minute (rpm). Possible definitions of fuzzy sets
are shown in Figure 3. The control rules are encoded
using typically 10 to 70 rules. Control
is performed based on the fuzzy inference mechanism
described in Section 2 and Figure 2. In controlling
a process, all of the rules are compared to
the current inputs (observations) and fired. The actions
(THEN-part) of each rule are weighted by how
closely its IF-part matches the current observation. In
the example of Figure 3, a rule has two inputs and
a single output. The weights are represented by $w_1$
and $w_2$. The results of firing each rule are then
combined by superimposing them. The final result
which is supplied to a controller should be a crisp
Figure 3: Typical fuzzy sets.


number rather than a fuzzy set; therefore we need to
perform a defuzzifying operation. This is computed
by taking the center of area under the fuzzy membership
function of the final result. Even though each
individual rule is an incomplete rule of thumb, the
results of firing each rule are properly weighted and
combined, and the final result represents a reasonable
compromise.

In order for a VLSI implementation of fuzzy inference
to be useful, a fair amount of pre-processing
(fuzzifying) and post-processing (defuzzifying) must
be performed on chip. The AT&T prototype chip
assumed that both of these processes are performed
by the host processor. However, the inference processing
is too fast for fuzzifying and defuzzifying to
take place off-chip on a host processor. This assumption
burdened the host processor and nullified the
advantage of the VLSI implementation of the inference
mechanism.


4  Chip Architecture and Implementation


The process controller system is configured as in Figure 4.
The VLSI implementation is done with four
components on a single chip: a fuzzifier, a rule memory,
an inference mechanism, and a defuzzifier. Each
input and output data item is 6 bits. This fits well
with available A/D and D/A converters. In addition,
our chip will communicate with a host processor.
The chip has a three-stage pipelined architecture.
The pipeline consists of the IF-part, the THEN-part, and
the defuzzifier.

We considered the size of the fuzzy set and the
grade of fuzziness for practical use. In most cases, a
fuzzy variable has three to sixteen elements and the
grade of fuzziness has three to twelve levels [5,8]. In
this chip implementation, the universe of discourse
of a fuzzy set is a finite set with 64 elements (i.e. 6
bits). The membership function has 16 levels (i.e. 4
bits). That is, 0 represents no membership, 15 represents
full membership, and the other numbers represent
points in the unit interval [0, 1]. A fuzzy membership
function is therefore discretized using 64 numbers
of 4 bits each; that is, 256 bits of memory storage. The
representation of a fuzzy set is as follows:




Figure 4: Fuzzy logic controller.
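As an illustration of the discretization described above (64 elements, 4-bit grades), the following Python sketch samples a membership function into 64 integer levels and checks the storage requirement. The triangular shape, its parameters, and the rounding rule are assumptions chosen for illustration; the paper does not specify them.

```python
UNIVERSE_SIZE = 64   # 6-bit element index, 0..63
LEVELS = 16          # 4-bit grade, 0..15
BITS_PER_GRADE = 4

def quantize(grade):
    """Map a grade in [0, 1] to a 4-bit level (0 = none, 15 = full)."""
    return round(grade * (LEVELS - 1))

def triangular(center, half_width):
    """Hypothetical triangular membership function sampled at 64 points."""
    return [quantize(max(0.0, 1.0 - abs(e - center) / half_width))
            for e in range(UNIVERSE_SIZE)]

fuzzy_set = triangular(center=32, half_width=8)
storage_bits = len(fuzzy_set) * BITS_PER_GRADE   # 64 * 4 = 256
```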
Fuzzifying is done using a table look-up. For each
observation (i.e. input stream), we store a table of
the membership function normalized at the center of
the horizontal axis. That is, the full membership is
at the center. According to an input value, the membership
function is shifted. The chip can produce 64
different membership functions from a single stored
pattern. The membership function can be associated
with a predicted measurement error of a sensor. If
we do not need fuzziness in the observed value, we
can store a pulse function, in which only one entry has
membership 1 and all the other entries are 0. The
result of the fuzzifying is broadcast to all of the
rules. In the actual chip implementation, the content
of the table is not shifted. Rather, the starting
address for the table look-up is shifted according to the
observation input.
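The shifted-address look-up can be sketched as follows. This is an illustration under assumptions that the text does not confirm: the stored peak sits at index 32, grades are 4-bit integers, and addresses that fall outside the table read as zero membership.

```python
SIZE = 64
CENTER = SIZE // 2   # assumed position of the stored peak

def fuzzify(pattern, x):
    """Return the membership function centered on input x (0..63).

    The stored pattern is never moved; only the starting address of
    the look-up changes with x.
    """
    start = CENTER - x
    out = []
    for e in range(SIZE):
        addr = start + e
        # Assumption: reads outside the table give zero membership.
        out.append(pattern[addr] if 0 <= addr < SIZE else 0)
    return out

# A pulse (no fuzziness) and a hypothetical triangular error model.
pulse = [15 if i == CENTER else 0 for i in range(SIZE)]
tri = [max(0, 15 - 3 * abs(i - CENTER)) for i in range(SIZE)]

crisp = fuzzify(pulse, 10)    # single entry of 15 at index 10
spread = fuzzify(tri, 10)     # peak of 15 at index 10, 12 at 9 and 11
```

Because `x` can take 64 values, a single stored pattern yields 64 distinct membership functions, as stated above.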

The chip is reconfigurable. A control system can
take four inputs and produce two outputs, or take two
inputs and produce one output, according to the
application. With the first configuration, we can have
51 rules on a single chip. Each rule has four clauses
in the IF-part and two actions in the THEN-part.


If  A  and  B  and  C  and  D 
Then  Do  E,  and 
Do  F. 


With the second configuration, we can execute 102
rules using the same data-paths. Each rule has two
clauses in the IF-part and one action in the THEN-part.


If  A  and  B  Then  Do  E, 
If  C  and  D  Then  Do  F. 
Figure 5: Reconfigurable data-path for rule execution.
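One way to picture how a single data-path serves both rule formats is the following Python sketch of the IF-part as a small tree of min elements. It is a reading of the description accompanying Figure 5, not a verified model of the circuit: the function name, the tuple interface, and the use of 15 as the all-ones full-membership value are assumptions.

```python
FULL = 15   # all-ones 4-bit value, i.e. full membership

def if_part(alphas, wide_rule):
    """Combine four clause weights into two premise weights.

    wide_rule=True  : 51-rule mode, one rule with four clauses; both
                      outputs carry the same weight (for actions E and F).
    wide_rule=False : 102-rule mode, two rules with two clauses each;
                      the last min stage is fed FULL and passes its
                      other input through unchanged.
    """
    a, b, c, d = alphas
    left, right = min(a, b), min(c, d)
    if wide_rule:
        return min(left, right), min(right, left)
    return min(left, FULL), min(right, FULL)

w_e, w_f = if_part((9, 4, 12, 7), wide_rule=True)     # (4, 4)
w_ab, w_cd = if_part((9, 4, 12, 7), wide_rule=False)  # (4, 7)
```

Feeding the neutral element of min into the last stage leaves the wiring unchanged between modes, which is why a single configuration bit suffices in this sketch.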


A data-path is assigned to each rule; therefore all
of the 51 or 102 rules are executed in parallel. There
are only two basic units; they are a parallel minimum
unit and a parallel serial unit. The former performs
the intersection operation on fuzzy sets, and
the latter performs the union operation. The configuration
of the IF-part of the data-path is shown in
Figure 5. The data-path can execute one rule with 4
if-clauses or two rules with 2 if-clauses. Four pairs
of min/max units compute the weights $\alpha$ for each
clause. The min elements, organized as a binary tree,
compute the weight $w$ of the premise, which is the
minimum of all the $\alpha$'s. In the 51 rule configuration, the
last two minimum units compute the same weight
$w_i$. In the 102 rule configuration, streams of 1's are
supplied and these two min elements behave as delay
elements. The control of the configuration is done
by setting a bit in the status register from the host
computer. Defuzzifying is done by computing the center
of area (COA) under the final membership function.
Denoting the final fuzzy subset as $A$, the COA
algorithm computes the following:

$$\mathrm{COA} = \frac{\sum_{n=0}^{63} n \, \mu_A(n)}{\sum_{n=0}^{63} \mu_A(n)}$$

Since each element of the universe is processed serially,
we can substitute repeated addition for multiplication
in the above computation. The data sequence
from the THEN-part is produced starting from the
most significant data point as follows:

$$\mu_A(63),\ \mu_A(62),\ \ldots,\ \mu_A(1),\ \mu_A(0).$$

Two adders and two registers are used as shown in
Figure 6. The denominator is computed by the first
adder and the numerator is produced by the second
adder. The numerator is computed by repeated
addition of the result of the first adder by the second
adder, which computes the following formula.

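$$
\begin{aligned}
& \mu_A(63) \; + \\
& \mu_A(63) + \mu_A(62) \; + \\
& \mu_A(63) + \mu_A(62) + \mu_A(61) \; + \\
& \qquad \vdots \\
& \mu_A(63) + \mu_A(62) + \mu_A(61) + \cdots + \mu_A(0).
\end{aligned}
$$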
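The multiplier-free accumulation can be sketched in Python with two running totals. The names and the handling of an all-zero input are assumptions. One indexing detail is also assumed rather than taken from the text: if all 64 partial sums were accumulated, each $\mu_A(n)$ would be counted $n + 1$ times, so this sketch leaves the last partial sum (the one ending at $\mu_A(0)$) out of the second accumulator, which makes the result equal to $\sum n \, \mu_A(n)$.

```python
def center_of_area(mu):
    """Center of area of mu[0..63] using additions only in the loop.

    running  : first accumulator, the partial sum mu[63] + ... + mu[n]
    weighted : second accumulator, the sum of those partial sums
    """
    running = 0
    weighted = 0
    for n in range(len(mu) - 1, -1, -1):   # n = 63, 62, ..., 0
        running += mu[n]
        if n > 0:                          # assumed: skip the last partial sum
            weighted += running            # mu[n] ends up counted n times
    if running == 0:
        return None                        # empty fuzzy set: COA undefined
    return weighted / running              # single division at the end

# Hypothetical symmetric conclusion centered on element 32.
tri = [max(0, 15 - 3 * abs(n - 32)) for n in range(64)]
coa = center_of_area(tri)                  # 32.0
```

The result agrees with the direct definition `sum(n * mu[n]) / sum(mu)`, but the loop itself needs no multiplier.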

In order to achieve higher yield, we allocated 51
data-paths on the chip, and non-functioning memory
units and data-paths can be isolated from the rest of
the chip. The isolation is achieved by blowing a fuse
using laser technology. Each pair of a memory unit
and a data-path can be reprogrammed to any other




Figure  7:  Redundancy 
Figure 6: Defuzzifier circuit.


address, also by blowing a fuse. This allows continuous
addressing of the memory units and data-paths after removal
of a defective unit from a chip. The schematic diagram
of the address removal and re-programming circuit
is shown in Figure 7.

The host processor downloads the rule set and the table
for fuzzification at start-up time. The fuzzy processor
looks like a static RAM chip to the host processor.
The RAM system, however, only has a row
decoder and does not have a column decoder. The user
can address each row (which corresponds to a clause or
action of a rule) by a memory address register. Each column
is addressed by a shift register because data are accessed
sequentially. The last address is reserved and
mapped to the status register. This register controls
the configuration of the data-paths and the operational
modes (load, run, or test).

The chip is designed for a 1 μm N-well CMOS
process of MCNC [7]. It uses a non-overlapping
two-phase clocking scheme. The chip is designed with
a target operational speed of 40 MHz. The chip consists
of approximately 614,000 transistors, of which
about 470,000 are used to form the static RAM system.
The die size is 7750 μm by 9080 μm, and it is packaged
in a standard pin grid array with 64 pins. The
supply voltage is 3.0-3.3 V.